This commit implements the F-beta score metric #1543
Conversation
This commit implements the F-beta score metric for the AnswerCorrectness class.
Hey, this is useful. From ragas 0.2 onwards, we have the factual correctness score; can you also add this to it?
score = 2 * (precision * recall) / (precision + recall + 1e-8)
Oh, thanks sir. I'll do it now.
The beta parameter is added to _factual_correctness, which computes a weighted harmonic mean of precision and recall, with recall weighted by a factor of beta. The F-beta score is defined as:

F-beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)

The F-beta score is a generalization of the F1 score, which is the special case beta = 1.0. The F1 score is the harmonic mean of precision and recall:

F1 = 2 * (precision * recall) / (precision + recall)
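The formula above can be sketched in a few lines of Python. Note that `fbeta_score` here is an illustrative helper written for this discussion, not the actual ragas implementation; the small epsilon guards against division by zero, mirroring the existing F1 code.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 1.0, eps: float = 1e-8) -> float:
    """F-beta score from raw counts; beta > 1 weights recall more, beta < 1 weights precision more."""
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    # (1 + beta^2) * P * R / (beta^2 * P + R), with eps to avoid a zero denominator
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall + eps)
```

With `beta=1.0` this reduces to the plain F1 score; raising beta shifts the score toward recall, so a model with perfect precision but imperfect recall scores lower under `beta=2.0` than under `beta=0.5`.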
I've added the F-beta calculation in factual correctness, keeping the F1 score as the F-beta score with beta = 1, as requested.
Hey @Yuri-Albuquerque, thanks for the change.
Absolutely, @shahules786, feel free to take over this merge! I completely agree that this function belongs in utils. I'm still learning how to contribute effectively to this project, so I appreciate your guidance. I work at one of the largest private banks in Brazil (Itaú SA), and we're using RAGAS to evaluate various features in our chatbot. The "F1-beta" metric is something we've been wanting for a long time.
Hey @Yuri-Albuquerque just made the changes from my end. Thanks a lot :)
The F-beta score is implemented for the AnswerCorrectness class. The beta parameter is introduced to control the relative importance of recall and precision when calculating the score. Specifically:
Key Changes:
The method _compute_statement_presence is updated to calculate the F-beta score based on true positives (TP), false positives (FP), and false negatives (FN).
This ensures that we can balance between recall and precision, depending on the task's requirements, by tuning the beta value.
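The key change above can be illustrated with a sketch of how the statement-presence step might compute the score from classified statements. The function name and list-based inputs are assumptions for illustration, not the actual signature of `_compute_statement_presence` in ragas.

```python
from typing import List

def compute_statement_presence(tp_statements: List[str],
                               fp_statements: List[str],
                               fn_statements: List[str],
                               beta: float = 1.0) -> float:
    """Score answer correctness from classified statements.

    tp_statements: answer statements supported by the ground truth
    fp_statements: answer statements not supported by the ground truth
    fn_statements: ground-truth statements missing from the answer
    """
    tp, fp, fn = len(tp_statements), len(fp_statements), len(fn_statements)
    precision = tp / (tp + fp + 1e-8)
    recall = tp / (tp + fn + 1e-8)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall + 1e-8)
```

For example, with 3 supported statements, 1 unsupported statement, and 2 missing statements, `beta=1.0` yields the ordinary F1 of 2/3, while `beta=2.0` penalizes the missing (recall) side more heavily and yields 0.625.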
source: https://scikit-learn.org/1.5/modules/generated/sklearn.metrics.fbeta_score.html
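For a sanity check against the scikit-learn definition linked above, the same quantity can be re-derived from label lists in pure Python (no sklearn import needed); the label vectors below are made-up examples:

```python
def fbeta_from_labels(y_true, y_pred, beta):
    """F-beta computed from binary label lists, following the sklearn definition."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

y_true = [1, 1, 1, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
f1 = fbeta_from_labels(y_true, y_pred, beta=1.0)
f2 = fbeta_from_labels(y_true, y_pred, beta=2.0)
```

Here precision is 0.75 and recall is 0.6, so F1 is 2/3 while F2 (recall-weighted) is 0.625, matching `sklearn.metrics.fbeta_score` on the same inputs.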